
[Feature] Speed up tests by moving external library imports to inside test functions#3908

Open
ParamThakkar123 wants to merge 5 commits into
pytorch:mainfrom
ParamThakkar123:speedup_tests


@ParamThakkar123

Contributor

Description

Speeds up tests by moving external library imports from module-level to inside test functions. This reduces import time when multiprocessing test workers spawn new Python processes, since heavy libraries (gym, gymnasium, tensorboard, hydra, ray, monarch, functorch) no longer need to be imported at module load time.
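As a minimal sketch of the pattern described above (not code taken from this PR), the snippet below defers an import until the function that needs it is called. The standard-library module colorsys stands in for a heavy optional dependency such as gym, and the function name test_needs_heavy_lib is hypothetical.

```python
import sys


def test_needs_heavy_lib():
    # The import runs only when this test is called, so merely importing
    # or collecting the test module does not pay for it.
    import colorsys  # stand-in for a heavy optional library (assumption)

    # After the first import the module is cached in sys.modules, so a
    # repeated function-level import is a cheap lookup.
    assert "colorsys" in sys.modules
    assert colorsys.rgb_to_hsv(1.0, 0.0, 0.0) == (0.0, 1.0, 1.0)


test_needs_heavy_lib()
```

A freshly spawned worker process that imports this module executes only `import sys` at load time; the stand-in library is loaded the first time the test body runs.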

Changes

  • All _has_* boolean checks now use lightweight importlib.util.find_spec() instead of try/except ImportError patterns that actually imported the library
  • Actual library imports moved inside test functions, so they only load when a test that needs them actually runs
  • test/libs/conftest.py: Moved import gymnasium/import gym module-level side effects into a session-scoped _setup_gym_backend fixture
  • test/libs/test_datasets.py, test/libs/test_gym.py: Removed module-level import gym/import gymnasium
  • test/test_inference_server.py: Replaced try/except ImportError + import ray/import monarch with find_spec
  • test/test_trainer.py: Replaced try/except ImportError for tensorboard with find_spec; moved TensorboardLogger/event_accumulator imports inside test functions
  • test/test_helpers.py: Replaced try/except ImportError for hydra with find_spec
  • test/modules/_modules_common.py, test/objectives/_objectives_common.py: Replaced try/except ImportError for functorch/vmap with lightweight checks
  • test/objectives/test_ppo.py: Moved make_functional_with_buffers import from module-level to inside test_a2c_diff
  • test/modules/test_rnn.py: Removed the module-level conditional import of vmap (from torch import vmap); the import now lives inside the two test functions that use vmap

Related Issue

Closes #657

… test functions

Replace module-level try/except ImportError patterns with lightweight
importlib.util.find_spec() checks, and move actual library imports
(gym, gymnasium, ray, tensorboard, hydra, functorch, etc.) into the
test functions that need them. This reduces import time when
multiprocessing test workers spawn new Python processes.

pytorch-bot Bot commented Jun 24, 2026


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3908

Note: Links to docs will display an error until the docs builds have been completed.

⚠️ 16 Awaiting Approval

As of commit ad6939d with merge base b660f05:

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Jun 24, 2026
@github-actions github-actions Bot added the Environments, Objectives, Modules, Trainers, and Feature labels Jun 24, 2026
@github-actions

Contributor

Benchmark Results: PR 00cc79be vs main b660f05b

Benchmark run: https://github.com/pytorch/rl/actions/runs/28140217886

Higher ops/sec is better. Tables are sorted by largest absolute change.

CPU

Compared 216 benchmarks. Regressions over 5%: 17. Improvements over 5%: 13.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 459.27 2,980 +548.87%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,193 487.88 -84.72%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 54.06 88.52 +63.75%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td1_return_estimate-False-False] 86.07 54.61 -36.55%
benchmarks/test_objectives_benchmarks.py::test_values[vec_td_lambda_return_estimate-True-False] 86.77 55.35 -36.21%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 86.06 55.57 -35.43%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,433 2,664 -22.40%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,579 2,919 -18.43%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] 24.96 29.05 +16.36%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 353.99 411.61 +16.28%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 475.68 543.25 +14.21%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 33.41 28.99 -13.26%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,079 938.27 -13.06%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] 23.14 20.16 -12.85%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 869.18 975.93 +12.28%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 121.27 134.63 +11.02%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 88.50 81.25 -8.20%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,917 3,154 +8.11%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 61.64 57.06 -7.43%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,171 2,017 -7.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,245 3,015 -7.09%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-64] 7.1217 6.6351 -6.83%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 272.87 290.77 +6.56%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 51.64 54.98 +6.46%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 4,910 5,223 +6.37%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] 246.13 260.09 +5.67%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 507.46 479.31 -5.55%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] 88.45 83.68 -5.40%
benchmarks/test_envs_benchmark.py::test_transformed 0.8824 0.9282 +5.20%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3905 1.3206 -5.02%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 185.28 176.11 -4.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,122 3,272 +4.83%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 263.40 275.63 +4.64%
benchmarks/test_envs_benchmark.py::test_serial 0.5661 0.5922 +4.60%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 547.92 523.98 -4.37%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 113.03 117.40 +3.87%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 380.76 395.19 +3.79%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,879 20,089 -3.79%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,092 2,013 -3.78%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 764.53 736.19 -3.71%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 38.72 37.32 -3.62%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 367,731 380,827 +3.56%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 399.36 413.06 +3.43%
benchmarks/test_collectors_benchmark.py::test_sync_preempt 16.93 16.36 -3.41%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 255.96 264.35 +3.28%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 1.9401 2.0004 +3.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,102 2,038 -3.05%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 28.50 27.63 -3.04%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 92.18 89.39 -3.03%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 252.64 245.17 -2.96%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,164 7,372 +2.91%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 173.70 168.66 -2.90%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-1] 480.14 466.39 -2.86%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 38,775 37,717 -2.73%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 27,555 26,804 -2.73%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 7,116 7,310 +2.72%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 562.57 548.03 -2.58%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,252 4,362 +2.58%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-64] 4.5467 4.4307 -2.55%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,089 1,061 -2.54%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 64,763 63,135 -2.51%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 0.2316 0.2258 -2.50%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6058 0.5908 -2.48%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-4] 163.29 159.28 -2.46%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 574.86 560.86 -2.44%
benchmarks/test_collectors_benchmark.py::test_single 8.8816 9.0979 +2.44%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-4] 72.19 70.46 -2.39%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-constant] 4,344 4,244 -2.32%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-1] 270.16 276.41 +2.31%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 62.70 61.27 -2.28%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 124.27 121.48 -2.24%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,916 23,399 -2.16%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 726.48 710.89 -2.15%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 269.85 275.58 +2.12%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-1] 524.53 513.50 -2.10%
benchmarks/test_envs_benchmark.py::test_simple 1.7853 1.8226 +2.09%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 29.43 30.04 +2.08%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 695.82 681.33 -2.08%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-64] 3.0568 2.9951 -2.02%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-4] 146.27 143.37 -1.98%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-16] 18.10 17.75 -1.94%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,744 1,778 +1.92%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 77,850 79,341 +1.91%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-4] 48.75 47.82 -1.90%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 283.99 289.39 +1.90%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 55,873 54,821 -1.88%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 195.66 199.30 +1.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 775.60 790.01 +1.86%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 648.44 660.25 +1.82%
benchmarks/test_envs_benchmark.py::test_parallel 0.9687 0.9512 -1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,794 30,239 -1.80%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 12,407 12,627 +1.78%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-None] 95.42 93.76 -1.74%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.0719 4.1411 +1.70%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-16] 36.68 36.07 -1.68%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 42,195 41,490 -1.67%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 25.79 25.36 -1.67%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 53.39 54.27 +1.66%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-16] 43.49 42.79 -1.60%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-64] 12.49 12.69 +1.59%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4439 1.4213 -1.57%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-16] 12.19 12.01 -1.50%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 34,523 34,011 -1.48%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 89.04 87.73 -1.47%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 38,213 37,653 -1.47%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,792 1,819 +1.46%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-None] 120.98 122.73 +1.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 30,618 30,179 -1.43%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-16] 50.07 49.37 -1.40%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 305.18 309.31 +1.35%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 50,029 49,362 -1.33%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 114.08 115.60 +1.33%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 244.27 241.02 -1.33%
benchmarks/test_collectors_benchmark.py::test_single_with_rb 8.7257 8.8411 +1.32%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 643.22 651.63 +1.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 836.63 847.51 +1.30%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 15.00 15.19 +1.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 45,128 44,556 -1.27%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.55 49.15 +1.25%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2180 0.2153 -1.25%
... ... ... Showing 120 of 216 comparisons, sorted by absolute change.

GPU

Compared 226 benchmarks. Regressions over 5%: 12. Improvements over 5%: 17.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 44.03 193.61 +339.74%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 993.52 2,617 +163.45%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 189.28 39.67 -79.04%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,712 3,708 +36.71%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[reduce-overhead-None] 98.99 128.68 +29.99%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,679 3,447 +28.66%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 118.06 84.71 -28.25%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,957 3,705 +25.27%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,624 2,954 -18.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,004 3,490 +16.19%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,694 3,117 +15.68%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 43.17 48.31 +11.91%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,352 2,961 -11.67%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,300 2,039 -11.36%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 819.04 732.43 -10.57%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] 5.4409 4.8908 -10.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,022 2,226 +10.09%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 991.45 898.05 -9.42%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 727.66 789.86 +8.55%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 18.63 20.07 +7.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 787.22 844.87 +7.32%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,844 1,958 +6.19%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 45.25 47.99 +6.06%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 4,576 4,308 -5.87%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,241 2,111 -5.80%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 141.74 134.44 -5.15%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 370,320 351,325 -5.13%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,984 2,085 +5.07%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 11.59 12.18 +5.06%
benchmarks/test_envs_benchmark.py::test_simple 1.2440 1.1842 -4.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 47.35 49.61 +4.77%
benchmarks/test_envs_benchmark.py::test_serial 0.4036 0.4229 +4.77%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 50.92 53.34 +4.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 495.28 518.10 +4.61%
benchmarks/test_envs_benchmark.py::test_parallel 0.5424 0.5181 -4.49%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] 6.4531 6.7243 +4.20%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 7.7201 7.4108 -4.01%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[100-img_shape0-atari] 16.96 17.63 +3.98%
benchmarks/test_collectors_benchmark.py::test_async 10.80 11.22 +3.94%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 298.06 309.04 +3.68%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 95.37 98.72 +3.52%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,215 22,971 +3.40%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-16] 37.08 35.82 -3.39%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 722.07 746.44 +3.37%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-4] 163.69 158.30 -3.29%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 42.33 43.70 +3.22%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] 364.08 375.69 +3.19%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 6,953 7,170 +3.13%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 32,328 31,320 -3.12%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 149.85 145.27 -3.06%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,866 12,224 +3.02%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 182.86 188.32 +2.99%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 6,087 5,906 -2.98%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 48.50 49.91 +2.91%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-64] 7.3155 7.1028 -2.91%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1,323 1,361 +2.90%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 348.96 338.95 -2.87%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] 74.84 76.92 +2.78%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb_cuda[200-img_shape1-large_batch] 8.4749 8.7061 +2.73%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.1446 8.3648 +2.70%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 34,723 33,798 -2.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 52.27 53.66 +2.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.60 22.17 +2.62%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-1] 485.36 472.79 -2.59%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 275.16 268.10 -2.57%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 504.86 517.60 +2.52%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1,245 1,276 +2.50%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[50-img_shape0-small] 873.70 852.11 -2.47%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 22.36 22.91 +2.42%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.57 23.12 +2.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 55,946 54,601 -2.40%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 0.2270 0.2216 -2.38%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-1] 643.77 629.00 -2.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,076 64,506 +2.27%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 339.39 346.99 +2.24%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 18,223 17,829 -2.16%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 891.08 871.83 -2.16%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,582 22,045 +2.14%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-1] 188.62 192.62 +2.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 23.26 23.75 +2.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,405 31,721 -2.11%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2145 0.2100 -2.10%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-1] 518.47 507.62 -2.09%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 419.70 410.95 -2.09%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-4] 189.30 185.38 -2.07%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 256.31 261.58 +2.06%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 310.93 317.22 +2.02%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.35 16.68 +2.00%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.2207 5.3236 +1.97%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 125.77 128.24 +1.97%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,477 1,450 -1.86%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 367.41 374.24 +1.86%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-4] 70.89 72.19 +1.83%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-None] 225.58 221.49 -1.82%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 818.64 803.95 -1.80%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-1] 280.27 285.22 +1.77%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 328.95 323.16 -1.76%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,295 1,318 +1.74%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 825.64 839.97 +1.74%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 237.57 241.69 +1.73%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 56,039 57,007 +1.73%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,183 23,570 +1.67%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 405.58 398.98 -1.63%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 77.12 78.34 +1.59%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 34,590 35,139 +1.59%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 69.77 70.87 +1.59%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-True] 37,947 37,346 -1.58%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,891 1,921 +1.57%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 445.97 452.80 +1.53%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-4] 47.67 48.40 +1.53%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 39,195 38,611 -1.49%
benchmarks/test_envs_benchmark.py::test_transformed 0.7014 0.6910 -1.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 504.36 511.80 +1.48%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 104.59 106.10 +1.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,676 19,393 -1.43%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 162.67 160.35 -1.43%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 266.28 270.02 +1.40%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 82.12 83.26 +1.39%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,401 63,529 -1.35%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 68.17 69.07 +1.32%
... ... ... Showing 120 of 226 comparisons, sorted by absolute change.

@github-actions github-actions Bot added the CI label Jun 25, 2026

Labels

  • CI: Has to do with CI setup (e.g. wheels & builds, tests...)
  • CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
  • Environments: Adds or modifies an environment wrapper
  • Feature: New feature
  • Modules
  • Objectives
  • Trainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Speed up tests by moving external libraries import to functions in tests

2 participants